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Abstract 

We give an efficient algorithm to generate a graph from a distribution e-close to G{n,p), in 
the sense of total variation distance. In particular, if p is represented with 0(logn)-bit accuracy, 
then, with high probability, the running time is linear in the expected number of edges of the 
output graph (up to poly-logarithmic factors). All our running times include the complexity of 
the arithmetic involved in the corresponding algorithms. Previous standard methods for exact 
G{n,p) sampling (see e.g. [2]) achieve similar running times, however, under the assumption that 
performing real number arithmetic with arbitrary accuracy takes constant time. We note that 
the actual accuracy required by these methods is 0(n)-bit per step, which results in quadratic 
running times. We also note that compromising on the 0(n)-bit accuracy requirement causes 
arbitrary large biases in the sampling. 

The main idea of our G{n,p) generation algorithm is a Metropolis Markov chain to sample 
e-close from the binomial distribution. This is a new method for sampling from the binomial 
distribution: it is of separate interest and may find other useful applications. Our analysis 
accounts for all necessary bit-accuracy and arithmetic. Our running times are comparable to 
known methods for exact binomial sampling (e.g. surveyed in [15]), however, only when the 
latter do not account for bit-accuracy and assume arbitrary bit arithmetic in constant time. 
Dropping these assumptions affects their running times and/or causes bias which has never 
been quantified. In this sense, our work can be viewed as a rigorous quantification of the 
tradeoff between accuracy and running time, when all computational aspects are taken into 
account. 

We further obtain efficient generation algorithms for random graphs with given arbitrary 
degree distributions, Inhomogeneous Random Graphs when the kernel function is the inner 
product, and Stochastic Kronecker Graphs. Efficient generation of these random graph models 
is essential for modeling large scale complex networks. To the best our knowledge, our work 
can be viewed as the first effort to simulate efficient generation of graphs from classical random 
graph models, while taking into account implementational considerations as fundamental com- 
putational aspects, and quantifying the tradeoff between accuracy and running time in a way 
that can be useful in practice. 
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1 Introduction 



Let n > 1 be an integer and let Qn be the set of 2v2j undirected simple graphs on n vertices. 
For < p < 1, we typically think of G{n,p) as a graph in Qn, where each edge {u,v} is present 
with probability p and absent with probability (1— p), independently from all other edges. The 
straightforward way to generate such a graph involves (2) independent experiments. Thus the running 
time is r2(n^). In practice, we need an additional Q{logn) to represent n distinct vertices and an 
additional Q(logp^^) to simulate sampling with probability p resulting in Q (n^(logn + logp"^)) total 
running time. 

If m is the number of edges of the output graph, it is highly desirable to aim for 0{m) running 
times, especially when E[m] << and n is very large. This is true, for example, in the case of 
complex networks. In these cases, the number of vertices n is known to scale massively, while the 
corresponding graphs remain relatively sparse [H El H [121 [12] (for example, due to natural underlying 
resource considerations, as has been discussed extensively in the literature). Taking into account the 
implementational issues mentioned in the previous paragraph and the resource considerations, we 
should aim for O (m(logn + logp~^)) running times. 

Let Wi, . . . ,Wn be non-negative weights on n vertices. Thus we may refer to the WuS as an n- 
dimensional vector w. Let G{n, w) be a graph in where each edge {u, v} is present with probability 
min{wuWv, 1}, independently from all other edges. Notice that G{n,p) is a special case of G{n,w), 
where vju = yfp for all u. Moreover, Gin^ w) subsumes the case of random graphs with given expected 
degrees [HI [9l [H [IH |2ll [22] . 

Let d > 1 be an integer and let Wi, . . . ,Wn be vectors in d-dimensional real space with non- 
negative coordinates: Wuk > 0, VI < m < , VI < < c?. We may thus refer to the WuS using an 
n X d matrix W. Let G{n, W) be a graph in Qn, where each edge {u,v} is present with probabihty 
mm{{wu,Wv) , 1}, independently from all other edges, and where {wu,Wv) is the usual inner product 
{wu,Wy) = Yl'i=i'^ukWvk- Notice that G{n^w) is a special case of G(n, VT), where 0?= 1. Moreover, 
G(n, W) is a special case of Inhomogeneous Random Graphs [5l [6], where the kernel function is the 
inner product. Random inner product graphs have been also studied in [26l [27] . In the context of 
complex networks, the interpretation of vertices represented as d- dimensional vectors is natural. Real 
datasets are categorical. Therefore, each dimension represents a distinct attribute, and data points 
connect with probabilities related to their similarity according to such attributes. 

Let k,d > 1 be integers, and let V he a d x d initiator matrix. Define recursively the matrix 
Kk = V ® Kk~ii where Kq = I and ® is the Kronecker product of matrices ^7\. For n = d^, 
let G{n,V) be a graph in Qn, where each edge {u,v} is present with probability mm{Kk{u,v),l}, 
independently from all other edges. G{n,V) is known in the literature as Stochastic Kronecker 
Graphs [Ta[T6l[I71[T8l[2l]. 

In the context of complex networks, the model of random graphs with given expected degrees 
G{n, w), the model of Inhomogeneous Random Graphs where the kernel function is the inner product 
G{n, W), and the Stochastic Kronecker model G{n,V) have been used to produce synthetic graphs 
that capture important structural properties of real networks. Thus, efficient algorithms to generate 
such random graphs are important both in theory and in practice. 

Let TT be a probability distribution on Qn- Random graph models, such as G{n,p), G{n,w), 
G{n,W), and G{n,V) defined above, are equivalent to such distributions vr over Qn- Moreover, 
properties of random graphs in such models are typically expressed as holding with high probability. 
This means that, for some constant c> 0, the subset ^bad of graphs in Qn for which the property 
does not hold has vr(^BAD) < [H [T2]. Such quantification is of fundamental predictive value. 
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both in theory and apphcations. 

Let tt' be a probabihty distribution on Qn that is e-close to vr, in the usual sense of variation 
distance: max-^cg„ l^'(^) ~ ^(^)l ^ ^- the context of large scale complex networks it is typical 
to check experimentally for various desired (or not desired) properties of graphs generated according 
to a target distribution vr. If generation was done according to vr' that is provably e-close to vr, and 
if the estimate was vr'(^BAD) < eo, we may readily infer that 7r(^BAD) < eo+e- On the other hand, if 
the estimate was 7r'(^BAD) > eo, we may readily infer that vr(^BAD) > cq — e- Thus, if we are able to 
fine-tune e, generation from vr' becomes also of fundamental predictive value. 

Our algorithms generate graphs from distributions e-close to those implied by G{n,p), G{n,tv), 
G{n,W), and G{n,V) respectively, at the cost of a multiplicative factor 0(loge~^) in the running 
time. This implies that we can set e = n~'^, for any constant c, thus matching typical high probability 
statements for random graphs [11[I2]. For historical reasons, we also mention that e-close sampling 
has been extensively used in theoretical computer science, for example in the context of approximate 
counting via Monte Carlo Markov chain simulation, among others. 

In particular, for G{n,p), the running time is O (/ilognloge~^(logn -|- logp~^)) in expectation 
where /i is the expected number of edges of the output graph and the factor {logn + \ogp~^) accounts 
for representation and arithmetic. We may obtain a high probability upper bound on the running 
time at a cost of an additional O(logn) multiplicative factor. 

Finally, we should note that, for the sake of clarity in presentation, we have not tried to optimize 
the poly- logarithmic factors at all points (especially in Section 5). There are also points where the 
pseudocode might hint to "difficult" arithmetic (such as computing square roots). In such cases, we 
comment right below those pseudocodes, as to how these points can be bypassed. 

The rest of the paper is organized as follows. In Section 1.1 we quantify the claim that known 
algorithms for generating Erdos-Renyi G{n,p) graphs with m output edges run in 0{m) time. We 
note that these methods actually require 0{n) bits of arithmetic accuracy per step, which clearly 
results in 0{mn) running times, when all computational aspects are considered. Hence these methods 
for G{n,p) (and hence all generalizations of G{n,p), e.g. see [22]) are not efficient. In Sections 2, 
3, 4, and 5 we develop efficient algorithms to generate random graphs e-close to G{n,p), G{n,w), 
G{n,W), and G{n,V) respectively. 

1.1 Previous work and Contributions 

The standard reference for generating a graph from G{n,p) with running time 0{m), where m is the 
number of output edges, is [2]. This algorithm was recently extended to G{n, w) in [22]. For G{n,p), 
the idea in [2] is to order vertices and have each vertex decide its "distance" or "jump" to its next 
neighbor according to the ordering, where "jumping" a vertex corresponds to an edge that is not 
present in the final output graph. All vertices bypassed by such jumps will not be processed in the 
corresponding iteration. Thus, the intuition is that the number of actual steps of the algorithm is 
proportional to the number of edges present in the output graph, suggesting a 0{m) running time. 

However, we argue below that the implementation of these "jumps" , which involve the simulation 
of a negative binomial with parameters n and p, immediately introduce the necessity of n-bit accuracy 
per "jump". Therefore, the running time becomes 0{nm). The authors bypassed this problem 
by assuming constant running time for real number arithmetic and representation with arbitrary 
accuracy. If n-bit accuracy is compromised for any poly log n-bit accuracy, then the bias becomes 
immediately arbitrary. 

The algorithm in [2] works as follows. First, the vertices are ordered, and the main observation is 
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that, at any given time during an iteration, the probabihty of generating the next edge after exactly 
k trials is {l—p)^~^p; i.e. waiting times for the edges are geometrically distributed. Let q= l—p; 
to sample waiting times, each positive integer k is assigned an interval C [0, 1) of length q^~^p. 
Realize that 'Yl^=i^^~^P— if ^^e intervals are contiguous starting at 0, then interval Ik ends 

at X]!Li?*~V= ^ — q^- Therefore, the waiting times can be sampled by randomly chosing r G [0, 1) 
and selecting the smallest k for which r < l—q'^. The pseudocode from [2] is included bellow. 

E ^%]V ^l]W i 1; 

while f < n do 

Draw r G [0, 1), uniformly at random; 

while w > V and v < n do 

w ^ w—v; V ^ v + 1; 
ii V < n then E ^ EVJ {-u, -u;}; 
return {E); 

It should be clear now that we need r sampled with accuracy 0{n) bits in order to simulate 
fair sampling from all intervals. If the arithmetic used has accuracy a(n), then all intervals Ik 
for k > a{n) will not be represented in the sampling. Thus, a{n) = o(n) introduces arbitrary 
(unquantified) sampling bias on every step of the algorithm. We also note that the arithmetic in [2] 
involves the computation of discrete logarithms, and this issue has been raised in [25] , who however do 
not offer any solution with quantified performance. In [22], the same issues carried over for G{n, w). 

Furthermore, the authors in [2] list an array of widely used software for random graph generation 
that implement inefficient algorithms. They note that such software provides running times tolerable 
for graphs up to tens of thousands of nodes. We remark that current technology requires synthetic 
data involving much larger number of nodes (e.g. to simulate social networks.) 

Of course, the most natural way to sample G{n,p) is to sample the number of edges m from the 
binomial distribution B{(^^,p), and then choose exactly m out of the (2) edges at random. The 
main idea of our G{n,p) generation algorithm is a Metropolis Markov chain to sample e-close from 
the binomial distribution. Our analysis accounts for all necessary bit-accuracy and arithmetic. Our 
running times are comparable to known methods for exact binomial sampling (e.g., surveyed in [T5]). 
however, only when the latter do not account for bit-accuracy and assume arbitrary bit arithmetic 
in constant time. Dropping these assumptions affects their running times and/or causes bias which 
has never been quantified. Our method to sample from the binomial distribution offers a rigorous 
quantification of the tradeoff between accuracy and running time, when all computational aspects 
are taken in to account. Therefore, it is of separate interest and may find other useful applications. 

Our algorithms for G{n,w), G{n, W), and G{n,V) offer also efficient running times while taking 
into account all computational aspects. To the best of our knowledge, our work can be viewed as 
the first effort to simulate efficient generation of graphs from classical random graph models, while 
(a) taking into account fundamental implementational considerations and (b) quantifying possible 
"errors" in a way that can be useful in practice, i.e. with quantified predictive value. 

2 Efficient Generation e-close to G{n,p) 

In this section vr is the probability distribution over Qn implied by G{n,p). In particular, for any 
specific graph G(\/,E)g^„, 7r(G(\/, E)) =pl^l(l-p)^-l^l, where iV=(^). It is obvious that n{\E\ = 
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k) = (^)p^(l — since there are (^) distinct graphs in Qn with k edges and vr assigns to all 
these graphs the same probability. Thus, if B{N,p) is the binomial distribution with parameters N 
and p, then vrdE'l = k) = B{k; N,p). 

Therefore, a natural two-step approach to generate a graph G(V, E) according to vr is to sample 
\E\ from B{N,p) and generate a random combination of out of possible edges. The second step 
can be implemented in time 0{max{\E\,n}\ogn) (including representation and arithmetic) using, 
for example, the classic algorithm in [3]. Sampling from B{N,p), however, is much more involved, 
and it is analyzed in detail in Subsection 12. 1[ 

In Subsection 12.21 we give algorithm Sample-G(?7,,p, e) which uses the Coupling Markov 
Chain (jlj) to generate and include all implementational aspects (beyond mixing time) that 
result in efficient sampling from a distribution e-close to G{n,p). 



2.1 Markov Chains e-close to Binomial Distributions 

Let B{N,p) be a binomial distribution with parameters TV e N and < p < 1. There are several 
methods to sample from the binomial distribution (for a detailed survey see, e.g. [15]). However, we 
desire a rigorous quantification of the tradeoff between accuracy and running time. Current known 
techniques do not have this feature. 

We design a Markov chain approach to sample from a distribution that is e-close to B{N,p), in 
the sense of variation distance. Let fi = Np; the Markov chains are Metropolis-Hastings random 
walks on a segment of the line around fi ± 0{\/ fi lne~^), with expected coupling times (convergence 
rates) 0(/ilne~^). The Markov chain is defined in ([3]) and the coupling which bounds convergence 
rate is defined in (j4]). Lemma 4 and Theorem 5 establishes convergence and mixing time. 



Let < e < 1, ^ = /i- [/ij, and A > y'4A^ max {p, ^ In (2/e)} In (2/e). If ^ < let fi=[i^\; 

otherwise let fi= \fi\, and define A~=min{A,/2} and A+ = min{A, — Finally, let B/^{N,p) be 
the following probability distribution defined on the integer interval /= — A~, /i+ A"' 



.+1 



Fact 1. 

5^fi(x;iV,p)<e (2) 

X0 



Proof. For p> jj\n{2/e) and 5= ./iMHM < Chernoff bounds suggest. 



Pr 



B{N,p)-fi\ > v/4/iln(2/e)l = Fi [\B{N,p)- fi\ > < 2e"^'' = e 



Similarly, for p<-^ln(2/e) and ,5 = iiB^M^ Chernoff bounds and /i< 4 In (2/e) suggest, 
Ft[\B{N,p)-I2\ > 4 In (2/e)] = Pr [B{N,p) > (1 + 5)/u] < e'^^ < e. 



Therefore, we have FT[\B{N,p)-jj,\ > A] < e and A > y'4iVmax {p, In (2/e)} In (2/e) for all p 
and < e < 1 . Now ([2]) in Fact 1 follows immediately. □ 
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To sample from Ba{N,p) we define a Metropolis-Hastings Markov chain M on the interval 
/ with stationary distribution B^(N,p). The transition probabilities are, 



where the functions a"' 



X, 



t+i 



Xt w.p. 1/2 

Xt + 1 w.p. a^{Xt)/4 
Xt-1 w.p. a-(Xt)/4 

l-a+{Xt) , l-a-(Xt) 



(3) 





Af-fc p 
k+1 1-p 



Xt W.p. 

and «"(■) are defined as usual for Metropolis-Hastings Markov chains, 

1 



if k = fi + A+ 
ii fi<k < fi + A+ 
ii k < fi 



a {k) 



N-k+l p 





if k > fi 

^ a fi>k> fL-A- 



if k 



Fact 2. The range ofa~^{-) and a (■) is [0,1]. 

Proof. This is ensured by the definition of jl and can be verified by elementary calculations. 



□ 



Fact 3. For any starting state (or probability distribution) Xq G /, Xt converges to B^{N,p). 

Proof. It is obvious that Xt is ergodic. Convergence to Ba{N,p) follows by verifying detailed balance 
conditions. The details are in Appendix 1. □ 

To bound the mixing time of M, we define a coupling {Xt, Yt) on / x / and analyze its coupling 
time. The transitions probabilities are. 



{Xt+i, Yt+i) 



' (Xi,yi + i) W.p 

{Xt,Yt-V) w.p 

[Xt,Yt) w.p 

(Xt + l,Ft) w.p 

{Xt-\X) w.p 

{Xt,Yt) w.p 



«+(l^t)/4 
oi-{Yt)l\ 
i-a+(yt) I i-a-(yt) 

4 "T 4 

a+(Xj)/4 
a-(Xi)/4 

l-Q+(Xt) , l-a-(Xi) 



(4) 



4 ' 4 

while 7^ y^. Once Xj = l^, they remain equal for all future times following the transitions in (|3]). 

Lemma 4. For the coupling iXt^Yt) with Xo = fi+A~^ and Yq = jl — A~ , let T = mint{Xt = Yt} be the 
coupling time. Then Xt is distributed according to B^{N,p), for all t >T. 

Proof. Let {Xt,Yt) be the coupling {Xt,Yt) with Xo = yU + A"*" and Yq sampled from the stationary 
distribution Ba{N,p). Thus Yt is distributed according to Ba{N,p), Vt. Notice that Xq = Xq > 
Yq > Yq implies immediately Xt = Xt > Yt > Yt, Vt by the monotonicity of the coupling. Thus, 
if T = mint{Xt = Yt}, then Xt = Xt = Yt = Yt, implying Xt = Yt- Therefore Xt is distributed 
according to Ba{N,p), and Xt is also distributed according to Ba{N,p), Wt >T. □ 



Theorem 5. E [T] is O (A^) and, for any c> 1, Pr[T > 2c\ognE [T]] < n" 
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Proof. First note that L = A++A- = (Xq-Fq) > (Xt-Yt) > andT = mmt{{Xt-Yt) = 0}. Furthermore, 
the definition of the couphng in (j4]) imphes that (X^+i— F^+i) is, 

r ix,-Y,)-i w.p. + ^ 

(Xt+i - Yt+i) = (X, - Y,) + 1 w.p. + (5) 

[ (x,-r,) + o w.p. i_£L^_fii^_qM_f^iiM 

To bound E[T], we introduce a simpler process which converges at least as fast at ([5]). In 
particular, let {at} be any sequence with 0<at<l for all t> 0. Let Zq = 0, and let 

( Zt + 1 w.p. 1(1 + at) 

Zt+i = I mm{Zt -1,0} w.p. i (1 + at) (6) 
[ Zt + w.p. 1(1 

Lemma 6. For some sequence {at}, Zt = L — {Xt — Yt) 

Proof. In order to proof this Lemma, we reduce the characterization of ([5]) to three cases: Xt > Yt > 
jl, Xt > fi > Yt, and Ji> Xt> Yt. In each case, we show that ^ is of the form The details are 
in Appendix 2. □ 

Let A be the set of all sequences in the interval [0, 1]. Then, 

E[r]=E[T = min{(Xt-rt)=0}] < max E[min{Zt = L}] (7) 
* ateA^ 

To bound the right-hand-side of ([7]), we use the following Lemma. 

Lemma 7. max E[mint{Zt = L}] < 2{L + 1)^ 

at E A 

Proof. The proof is a suitable adaptation of the proof of an equivalent statement for random walks 
on the integer line with refiecting barrier at zero. The details are in Appendix 3. □ 

Finally, using the bounds in ([7]) and Lemma 7, we get the upper bound for the expectation of the 
coupling time T, 

E[T]<E mm{Zt= L} < 2{L + 1)^ = 0{A'^) 

For the high probability statement of Theorem 5, Markov's inequality implies Pr [T > 2E[T]] < 
1/2. If we view (pessimistically) the simulation of 2clog?7,E [T] steps of the process {Xt,Yt) as clogn 
independent experiments, each experiment consisting of running 2E [T] steps of the process {Xt, Yt), 
the probability that they all fail gives the bound Pr [T > 2E [T]c log n] < {^Y "^"^ = n~'^. This 
completes the proof of Theorem 5. □ 

2.2 Efficient Implementation for Sampfing e-close to G{n,p) 

We remark some important considerations for implementing Sample-G(?T,,p, e). The algorithm first 
uses standard multiplication algorithms to compute n = Np in 0(log nmax {logn, logp"^}) time. 



To find A > y 4Xmax {p, In (2/e)} In (2/e) efficiently, the algorithm may bound from above each 
term inside this expression by the corresponding smallest power of 2, thus making the computation 
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of the resulting logarithm and square root elementary. This allows for a suitable A to be computed 
in total 0(log^ nmax {logn, logp~^, loge~^}) time. 

Sample-G(n,p, e) subsequently simulates the Coupling Markov Chain (jlj with starting state 
as in Theorem 5 to produce k from B/^{N,p). Each step of the simulation involves the simula- 
tion of a step of the Metropolis-Hastings Markov Chain By the definition of a^(-) and 
«"(■), each step of the simulation of ([3]) can be completed in 0(max {logra, logp~^}), for a total of 
0(lognmax{log?T,, logp~^}) to update (write) Xt. The above, combined with Theorem 5, implies 
that k can be sampled from B^{N,p) in expected time 0(A^log?imax{logr2,logp~^}). 

Notice that = 0(iV max {p, ^ In (2/e)} In (2/e)) = 0(max {/i, 41n (2/e)} In (2/e)). We hence- 
forth make the assumption that /i > 4 In (2/e) (or else, a naive faster algorithm can be used instead), 
thus A^ = O (yuln (2/e)). The total running time, is 0(yulognln (2/e) maxjlogn, logp~^}), includ- 
ing all computations, and the running time exceeds 0(c/i log^ In (2/e) max {log n, logp"^}) with 
probability 0{n~'^) for any c > 1. 

Finally, Sample-G(n,p, e) chooses k out of = edges and outputs G(y,E). This step can 
be implemented in time 0(max{/c,n} logA^) (including representation and arithmetic) using, for 
example, the classic algorithm in [3]. 

Theorem 8. Let ir be the distribution on Qn implied by G{n,p). Algorithm Sample-G(n,p, e) outputs 
G{V, E) G Qn sampled from a distribution vr' on Qn that has total variation distance from ir at most 
e. Moreover, for all G{V, E) EQn, tt' has the following additional properties: 

• TT'{GiV,E))>n{GiV,E)) =^ vr'(G(l^, E)) = ^MMll 

• niG{V,E)) <TrlGlv,E)) ^ niG{V,E)) = 

(implementing the natural Coupling from the Past modification). Also, Algorithm Sample-G(?T,,p, e) 
runs in 0(/ilognln (2/e) maxjlogn, logp~^}) expected running time, including all computations, and 
for any c > 1, the probability that the running time exceeds 0(/xc log^ n In (2/e) max {log ra, log p~^}) 
is 0{n~'^). 

Proof. Follows from Theorem 5, Fact 1, and the description of Sample-G(n,p, e) given above. □ 

Remark. There is a natural analogue to Theorem 8 for bipartite graphs. For integers ni and n2 
and 0<p, e< 1, let G{ni, n2,p) be the random bipartite graph with ni right vertices, n2 left vertices, 
and edge probability between a right vertex and a left vertex p. There is an algorithm Sample- 
G(ni, ?T,2,p, e) which generates G{V,E) G Qni,n2 in running time and with properties completely 
analogous to those stated in Theorem 8. (In particular, all computations follow by replacing n by 

77-1 + 71.2). 

3 Efficient Generation e-close to G{n, w) 

In G{n, w), the input consists of n real numbers tv = wi, W2, ■■■,Wn corresponding to a weight for each 
vertex. The probability of an edge {u,v} is given by mm{wuWy,l}, independently from all other 
edges. Throughout this section, and by analogy to Section 2, = E[|i?|] and tt is the distribution 
over all graphs on Qn according to G{n, w). The main idea of the algorithm is to partition the vertices 
according to their weights, where the weights inside each partition class are within a multiplicative 
factor of 2. 

Let q = max(| log2 max„{w„}|, | logg min„{w„}|). In Phase 1, the algorithm rounds up each Wu to 
the next power of 2, which partitions the vertices into 0{q) classes. Simultaneously, the (2) possible 
edges are partitioned according to the rounded weight of their endpoints. In Phases 2 and 3, the 
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algorithm generates a random subgraph within each edge class independently. We observe that these 
random subgraphs are either in G{n,p) or G{ni, n2,p) for which we may use Sample-G(n,p, e') and 
Sample-G(ni, n2,p, e') of Section [21 for suitable choice of e' that we shall determine. In Phase 4, 
the algorithm normalizes the output graph (usual accept-reject), so that each edge is sampled with 
probability mm{wuWy, 1} instead of the rounded weights. 



i ^ lo£ 



I wiu] 



Sample-G(tyi, W2, Wn, e) 

A; ^ 0; 

%Phase 1: Rounding 
for all vertices u 

w{u) ^ 2ri°S2("'«)l; 

if Q = then + 1 

a^QU {u}; 
e' ^2e/{k{k + l)); 
%Phase 2: G{n,p,e') 
for all {a ^ 0) 

E ^ E U SampIe-G(|C, 
%Phase 3: G{ni,n2,p, e') 
for all {Gi ^ and Gj ^ iD, j > i 

E^ EU Sample-G(|a 
%Phase 4: Normalization 
foreach {e = {u,v} e E) E = E \ {e} w.p. 1 
return (E); 



.min{w{if, l},e'); 



^\, \Cj\,mm{w{i)w{j), l},e') 



w{u)w{v) ' 



Theorem 9. Sample-G(wi, w„,e) generates a graph from a distribution n' that is e-close to ir in 
expected time O (/i logn log(g/e) (logn + q) + q^). Moreover, for any constant c the probability that 
the running time exceeds its expectation by a 2c logn multiplicative factor is at most 0{n~^). 

Proof. The normalization happening in Phase 4 ensures that the rounding in Phase 1 has no net 
effect on the distribution the algorithm samples from. This may not be immediately obvious, since 
the edges of the graph constructed at the end of Phase 3 were not the result of fully independent 
sampling. However, it is tedious but straightforward to bound the probability of a graph GiV, E) 
being the output at the end of Phase 3. That is. 



W WuWv J]^ (1 
{u,v}gE {u,v}^E 



Wuw,) < Fi[G{V,E)] < 



Yl{u,v}eE '^uWv Y[{u,v}<^eO- 



WuWy) 



Phase 1 partitions the vertices into k classes. Let iTij be the probability distribution according 
to G{n,w) over subgraphs in edge class [Gi,Gj]. In Phases 2 and 3, Theorem 8 guarantees that, for 
each i < j, the algorithm samples from a distribution tt'- j which is e-close to vTjj. 

Let X be the set of all graphs from which n' does not sample, and for each i and j let Xij be the 
set of subgraphs from which ir'^j does not sample. To sample from vr', the algorithm samples once 
from each ir'^j. Therefore, G G X if and only if there exists a subgraph H of G such that H G Xij for 
some i and j. Using union bound. 



=1 j-- 
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Let G be any of the possible output graphs of the algorithm, and let Hij be the subgraph of G 
induced by the vertices in classes Q and Cj. By Theorem 8, 



k k 



k k 



k k 



<G) = nn ^^^■(^^^■) = nn - ^^^^x^,)) < nn = ^'(^) 

i=l j=i 1=1 j=i i=l j=i 

Hence, max5cg„ |vr(5) - 7r'{S)\ = max^cg^ |vr(5 f] X) + 7r{S \X)- 7r'{S f] X) - 7r'{S \ X)| < e. 

To analyze the running time, let Ti, T2, T3 and T4 be the running times for each of the four phases 
in the algorithm. Then, Ti = 0{qn). Let w{i) be the rounded up weight and \Ci\ = rii for each i. 
By Theorem 8, 



.4 = 1 

k k 



min{w;(z) , 1} ■ logra ■ log(e ) (logn + q) 



O j niUj mm{w{i)w{j), 1} ■ logn ■ log(e') ^ • (logn + q) 

J=l j=i+l 



Thus, T2 + T3 = O (/i' lognlog(e')^^ (logn + q) + k"^) where /i' is the expected number of edges of 
the output graph using the rounded weights. Given that T4 = 0(g/i'), /i < yu' < and k = 0{q), 
the total expected running time of the algorithm is O (/ilognlog(g/e) (logn + q) + q^) . 

Finally, we observe that Markov's inequality implies Pr [T > 2E[T]] < 1/2. Considering the worst 
case where each group of clogn steps is an independent experiment, we get Pr [T > 2clognE[T]] = 
0{n~'^). This completes the proof of Theorem 9. □ 



4 Efficient Generation e-close to G{n^ W) 

In G(n, W), the input consists of an n x d matrix W containing n vectors: W = [wi, ...,Wn), where 
each vector u4 ^ corresponds to a vertex u. The probability of each edge {u,v} is given by 
mm{{wu,Wv),^}, independently from all other edges. Throughout this section, and by analogy to 
previous sections, let fi = E[\E\], vr be the distribution over Qn according to G{n,W), and q = 
max(| logmax^j{||i4||}|, | logmin„{| |}|). 



SampIe-G(W^,e) 
%Phase 1: Rounding 
for M = 1 to n 

Liu) ^ \lTl=iKk^ 

%Phase 2: G(n, L, e) 
E Sample-G(L,e); 
%Phase 3: Normalization 
foreach (e = {u,v} E E) 

E = E\{e} w.p. l-^M^M. 
return (E); 

The main idea of the algorithm is to reduce sampling from G{n, W) to repeated sampling from 
G{n, w) using properties of the inner product to round and normalize. The reduction uses the fact 
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that (wu.Wv) = \\wu\\\\wv\\cos{ip{u,v)) where ip{u,v) is the angle between vectors Wu and Wy. In 
Phase 1, Sample-G(Vr, e) assigns to each vertex u a weight equal to its length = ||wm||- In Phase 
2, the algorithm calls Sample-G(/i, ...,/„, e') with a suitable choice of e' we shall determine. Notice 
that at this point, each edge {u, v} has been sampled with probability | \wu\\ \ \wy\ \ instead of {wu, Wy) . 
In Phase 3, the algorithm normalizes the output graph (usual accept-reject). 

Theorem 10. Samp\e-G{W, e) generates a graph from a distribution vr' that is e-close to vr in 
expected time O {dfi log nlog{q/e) [log n + q) + q"^) . Moreover, the probability that the running time 
exceeds its expectation by a 2clogr;, multiplicative factor is at most 0{n~'^). 

Proof. The normalization in Phase 3 ensures that the rounding in Phase 1 has no net effect on the 
distribution the algorithm samples from. Thus, by Theorem 9, vr' is e-close to tt. In Phase 1, one 
could round up each L{u) to an even power of two simplifying the calculation of the square root with 
no effect in the running time (this step was left out of the pseudocode for clarity). Therefore, Phase 
1 takes 0{qdn) time. If fi' is the expected number of edges of the output graph prior to Phase 3, 
then. Theorem 9 guarantees that Sample- G ( VT, e) runs in O (yu'lognlog(g/e)(log?i + q) + q^) time 
on expectation. To complete the proof, we use the following Lemma, 

Lemma 11. dfi > fi' 

Proof. For all u and v for which (wu^Wy) > 1, the generated subgraph graph will be complete and 
can be obtained trivially in 0{qnlogn). Hence, without loss of generality, we may assume that 
{wu, Wy) < 1 for all u and v. 

n n n n 

Oberve that /i = ^ ^ u^) and yu'= ^ ^ When u = t;, (i/7ti, w^) : 

«=1 v=u+l u=l v=u+l 

thus, we can show instead, 



\Wy 



n n ^ n n 



\Wu\\\\Wy\ 



u=l v=l 



u=l v=l 



Let Wuk denote the components of Wu for each u. Then w^k = luCos{ipuk) where ipuk is the 
angle between Wy and the k-th dimension's axis. From the definition of inner product follows that 

d 

cos{ip{u,v)) = J2 cos{ipuk) ■ cos{ipyk). Therefore, 
fc=i 



n n 



n n 



u=l v=l 



^cos(v3ife) ■ cos(v3. 



'jk) 



u=l v=l \ k=l 

d ( n ^' " 
^ ^lu- COs{LPuk) ^ Iv ■ COs{Lpyk] 

k=l \ u=l \ v=l 



k=l 



^ 2 



^ "^lu- COs(v9„fc) 



^ u=l 



Using Jensen's inequality for the first bound, and repeated triangular inequality for the second. 



fc=i 



/i = ^ ■ cos(v3„fc) 



u=l 



> 



d n 

J2 J2^u- COs(v3„fc) 
^ k=l u=l 

d 



N 2 




2 

n n 








^ > 




u=l v=l 



d 



□ 



Thus, Sample-G(iy, e) runs in O ((i/ilognlog(g/e)(logn + q) + q^) expected time. The high proba- 
bility statement also follows immediately from Theorem 9. □ 
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5 Efficient Generation e-close to G{n^ V) 



In G{n,V), the input consists of a positive integer k and a dx d "initiator" matrix V={6ij). Define 
Kk recursively using the Kronecker product of matrices [17] . 

/0iiKk-i OuKk-, ... e,A-i\ 
Kk = V®Kk^i= \ \ ■■. \ 

\OnlKk~l 9n2Kk-l . . . 9nnKk-l J 

where Kq is the identity. In G{n,V), the probability of an edge {u,v} is given by mm{Kk{u,v), 1}, 
independently from all other edges. Throughout this section, n = d^ is the number of vertices of 
the output graph, /i = E[|i?|], tt is the distribution over all graphs on Qn according to G{n,V), and 
q = max(| logg maxij{%}|, | logs 

The main idea of the algorithm is to reduce sampling from G{n, V) to repeated sampling from 
the binomial distribution using the method in Section 2. Sample-G(?T,, P, e) partitions the edges 
according to their probability of occurrence. The probability of occurrence for each edge e is given 
by pe = 6162. ..Ok where each 6i is an entry of V. Hence, pe = ^11^ ■■■^m'^ ioi a suitable sequence of 
Ojj's. The number of edges with probability of occurrence pe is Ne = ^^^1^'^ — j. Therefore, there is a 
natural two-step approach to generate the random subgraph from each edge class. First sample \E\ 
from B{Ne,pe), using the method described in Section 2, and then generate a random combination 
of I -El out of Ne possible edges, using for example the classic method in . 



Sample-G(r2, V, e) 

e' ^e/k''"; 

foreach ((an, ai2, a^d) G Z'^^ : J2i=iY^'j=i(^ij = k and aij>0 Wi,j] 



N ^ 



aii!ai2!...add! ' 
d d 



1=1 j=i 



M ^ Sample-B(A^,p,e'); 
{ci, ...,cm} ^ Random-Combination(A^, M); 
for i = 1 to M E ^ EU {e^J; 
return (E); 



The pseudocode may hint to ineficient arithmetic operations, but it is not the case. Establishing 
two suitable orders, one among edge classes and one among edges within each class, we can find the 
next edge class easily in 0{d^) time. Similarly, searching for the Cj-th edge in an edge class can be 
done in 0{d'^). Therefore, finding the edges corresponding to the random combination {ci,...,cm} 
takes 0{d'^M) time. 

is initialized to fc!, and it is recomputed in each step with only one single multiplication opera- 
tion. Computing p involves d'^ multiplications per step, and using standard multiplication algorithms 
this can be done in 0{d'^k'^ log^ q). Therefore, we have a very crude bound of 0{d^k'^ log^ q + d'^M) for 
the cost of arithmetic and auxiliary operations per step, where, in practice c? is a very small constant 
[Tt] . we have not tried to optimize the exponent of d for the sake of clarity. 

Theorem 12. Sample-G(n, P, e) generates a graph from a distribution tt' that is e-close to vr in 
expected time 0{d'^iJi\og{e~^)mayi{\ogn.,\ogq} + d'^ {log nY^ ^"^ q). Moreover, the probability that 
the running time exceeds its expectation by a 2c log n multiplicative factor is at most 0{n~'^). 

Proof. The proof is very similar to the proof of Theorem 9. The details are in Appendix 4. □ 
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Appendix 1: Proof of Fact 3 

Fact 3. For any starting state (or probability distribution) Xq G /, Xt converges to Bi\{N,p). 

Proof. It is obvious that Xt is ergodic. Convergence to B/^{N,p) follows by verifying detailed balance 
conditions. For k in the range /i+l < k < fl+A'^, detailed balance conditions are equivalent to: 

B{k-l;N,p)FT[Xt+i = k\Xt = k-l] = B{k-l;N,p) a+{k-l) 

^! ^.fc-in .N-k+iN-k + 1 p 



{k - i)\{N - k + ly/ ' k 1-p 



{k)\{N-k)V 
= B{k;N,p) 
= B{k;N,p) a-{k) 
= B{k;N,p)Fi[Xt+i = k-l\Xt = k] 
For k in the range /i — < k < fi, detailed balance conditions are equivalent to: 
B{k;N,p)FT[Xt+i = k-l\Xt=k] = B{k; N,p) a- (k) 

^! ,_^A^-fc k 1-p 



{k)\{N -k)r ' N-k + 1 p 



(A; - l)!(iV- A; + 1)!' 
= B{-l;N,p) 
= B{k-l;N,p) a^{k-l) 
= B{k-l;N,p)FT[Xt+, = k\Xt = k-l] 

This completes the proof of Fact 3. □ 

Appendix 2: Proof of Lemma 6 

Lemma 6. For some sequence {at}, Zt = L — (Xj+i — Yi+i). 

Proof. For this, we reduce the characterization of ([5]) to Cases 1, 2 and 3 below. In each case, ((|8]), 
Q and ([lOD), we show that (ED is of the form (Q. 

Case 1: If Xt > Yt > /i, then a~{Xt) = l, and letting Yt = k, Xt = k+x, x> 1, we have 

r {Xt-Yt)-i w.p. + i 

- Yt^i) = { ixt- Yt) + 1 w.p. 1 ^ + 

[ (x,-r,) + o w.p. \-\m + ^)^,^\---^ 

Realize that > ^"^"j"' and a~{Yt) < 1. Thus moving probability ^ - ) ^ from the 

—1 level to the level, and moving probability | — " from the level to the +1 level, we may 
bound 




(8) 



Case 2: If Xt > fi > Yt, then a'^(Yt)=a {Xt) = l, and we have 
i^t+i - yt+i) = 



{X,-Yt)-1 w.p. i 
{Xt-Yt) + 1 w.p. 



+ 



a+{Xt) 



2 

(At - Yt) W.p. 2 4 4 



Moving 1/4 probabihty from the —1 level to the level, and | 
level to the +1 level, we may bound 



° ^^"^ — " ^^^^ probability from the 



{Xt-Yt)-l w.p. \ 

(x^+i - y^+i) <{ {Xt- Yt) + 1 w.p. i 



(9) 



{Xt-Yt) w.p. i 
Case 3: If /i > > y^, then a+(F() = l, and letting yi = A;, Xt = A; + x, x>l, we have 



Yt 



i+l; 



(X,-F,)-l w.p. + i 

{Xt-Yt) + l w.p. + ^ 



{Xt-Yt) + Q w.p. 



Af-fc-x+l 



N-k+ 



1/ p ^ 4 



4 



and a'^{Xt) < 1. Thus moving probability | (- 



k N ]-p 



Realize that jy-t-x+i > lv=fc+i """"" " v-^t; - - — ^ V7v-fc-x+i Af-/t+i 

from the —1 level to the level, and moving probability ^ — " ^^''^ from the level to the +1 level, 
we may bound 



t+i 



Yt 



t+i, 



< < 



iXt-Yt)-l w.p. Hl+(^)if 
iXt-Yt) + l w.p. Hl+(^)i^ 
{Xt-Yt) + w.p. : 



fc 1— p 



(10) 



□ 



Appendix 3: Proof of Lemma 7 

Lemma 7. max E[mini{Zt = L}] < 2(L + 1)^ 

at G ^ 

Proof. The proof is a suitable adaptation of the proof for random walks on the integer line with 
reflecting barrier at zero. For /c > 0, we argue inductively that 

f{k + 1) = max max E \mm{Zt'+t = k + l\Zt' = k}] < A{k + 1) (11) 

The base case /(I) < 4 is obvious. For the inductive step, the recursive definition in ([6]) and the 
definition of f{k + 1) in f llip imply that for some 0<q;<1 

/(A: + l) <i(l + a) + i(l-a)(l + /(fc + l)) + i(l + a)(l + /(fc) + /(A; + l)) 
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or equivalently, 



1 + a 

< 4 + /(A;) 

< 4 + 4/c (by the inductive hypothesis) 
= 4{k+l) 

thus estabhshing [11] Combining ([7]) and f|TT]) we get 

A++A- 

max E[mm{Zt = (A~^ + A^)] < max max E m.m{Zt/^t = k + l\Zt' = k} (12) 

fe=i 

< 2(A++A- + 1)2 (14) 

□ 



Appendix 4: Proof of Theorem 12 

Theorem 12. Sample-G(n, "P, e) generates a graph from a distribution it' that is e-close to it in 
expected time O ^(i^/ilog(e~^) max{logn, logg} + d'^(\ognY^~^'^log^ Moreover, the probability that 
the running time exceeds its expectation by a 2c log n multiplicative factor is at most 0{n~^). 

Proof. Taking e' = e/Zc'', we may follow the exact same steps as in the proof of Theorem 9 to 
show that vr' is e-close to vr. In each step, Sample-B(A^, p) and Random-Combination(A^, M) are 

called for a total expected running time of O ^/i log(/c'^^/e') max{logn, logg}^ , using Theorem 5 and 

the method to sample random combinations discussed in [3]. The cost of arithmetic and auxiliary 
operations per edge class is 0((i^A;^ log^ g), and there are k'^'^ classes with k = 0{logn). Therefore, 

the overall running time of the algorithm is O (^d^^ log(e'~ 1) max{log n, log 9} + d'^(\og nY^^"^ log^ . 
Notice that some log log n factors are omitted, and, as mentioned before, (i is a very small constant. 
The proof of the high probability statement for the running time goes exactly as in Theorem 9. □ 
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